Format-preserving cryptographic systems

ABSTRACT

Key requests in a data processing system may include identifiers such as user names, policy names, and application names. The identifiers may also include validity period information indicating when corresponding keys are valid. When fulfilling a key request, a key server may use identifier information from the key request in determining which key access policies to apply and may use the identifier in determining whether an applicable policy has been satisfied. When a key request is authorized, the key server may generate a key by applying a one-way function to a root secret and the identifier. Validity period information for use by a decryption engine may be embedded in data items that include redundant information. Application testing can be facilitated by populating a test database with data that has been encrypted using a format-preserving encryption algorithm. Parts of a data string may be selectively encrypted based on their sensitivity.

This application is a division of patent application Ser. No. 11/654,054, filed Jan. 16, 2007, which is hereby incorporated by reference herein in its entirety. This application claims the benefit of and claims priority to patent application Ser. No. 11/654,054, filed Jan. 16, 2007.

BACKGROUND OF THE INVENTION

This invention relates to cryptography and, more particularly, to preserving data formats during encryption and decryption operations.

Cryptographic systems are used to secure data in a variety of contexts. For example, encryption algorithms are used to encrypt sensitive information such as financial account numbers, social security numbers, and other personal information. By encrypting sensitive data prior to transmission over a communications network, the sensitive data is secured, even if it passes over an unsecured communications channel. Sensitive data is also sometimes encrypted prior to storage in a database. This helps to prevent unauthorized access to the sensitive data by an intruder.

Commonly used encryption algorithms include the Advanced Encryption Standard (AES) encryption algorithm and the Data Encryption Standard (DES) encryption algorithm. Using these types of algorithms, an organization that desires to secure a large quantity of sensitive information can place the sensitive information in a data file. The data file can then be encrypted in its entirety using the AES or DES algorithms.

Encrypting entire files of data can be an effective technique for securing large quantities of data. However, bulk encryption of files can be inefficient and cumbersome because it is not possible to selectively access a portion of the encrypted data in an encrypted file. Even if an application only needs to have access to a portion of the data, the entire file must be decrypted. Without the ability to selectively decrypt part of a file, it can be difficult to design a data processing system that provides different levels of data access for different application programs and for different personnel.

To avoid the difficulties associated with encrypting entire files of sensitive data, it would be desirable to be able to apply cryptographic techniques such as the AES and DES encryption algorithms with a finer degree of granularity. For example, it might be desirable to individually encrypt social security numbers in a database table, rather than encrypting the entire table. This would allow software applications that need to access non-sensitive information in the table to retrieve the desired information without decrypting the entire table.

Conventional encryption techniques can, however, significantly alter the format of a data item. For example, encryption of a numeric string such as a credit card number may produce a string that contains non-numeric characters or a string with a different number of characters. Because the format of the string is altered by the encryption process, it may not be possible to store the encrypted string in the same type of database table that is used to store unencrypted versions of the string. The altered format of the encrypted string may therefore disrupt software applications that need to access the string from a database. The altered format may also create problems when passing the encrypted string between applications. Because of these compatibility problems, organizations may be unable to incorporate cryptographic capabilities into legacy data processing systems.

It would therefore be desirable to be able to provide cryptographic tools that are capable of encrypting and decrypting data without altering the format of the data.

SUMMARY OF THE INVENTION

In accordance with the present invention, a data processing system is provided in which a format-preserving cryptographic function may be used for format-preserving encryption operations and format-preserving decryption operations. The data processing system may include a key server. The key server may provide cryptographic keys to authorized key requesters. The key server may use policy rules to determine which key requesters are authorized to obtain a copy of a given key. If a key requester is authorized, the key server may generate the requested key and may provide the key to the key requester over a communications network.

Key requests may include identifiers. Identifiers help to identify key requesters and key requests. Suitable identifiers may include user names such as the name of an individual, the name of an organization, the name of a group, etc. Policy names and program names may also be used as identifiers.

If desired, key validity period information may be included in an identifier. With one suitable arrangement, data to be encrypted or decrypted using a key is credit card data and the validity period information is a credit card expiration date.

Using a format-preserving encryption function, plaintext may be encrypted to form ciphertext. Validity period information may be embedded in the ciphertext for use in requesting and generating an appropriate decryption key. The validity period information may be embedded by combining an index value that corresponds to a particular validity period with redundant information such as a checksum value in a credit card number. Upon receipt of the ciphertext containing the embedded validity period information, an application can extract the embedded validity period information. The extracted validity period information can be used in selecting an appropriate key to use in responding to the key request, so information such as the validity period information may sometimes be referred to as key selector information or a key selector.

In a data processing system including multiple applications that access a common database, testing can be facilitated by using a format-preserving encryption engine to encrypt sensitive data prior to testing. In a normal production environment for the data processing system, multiple applications access a production database that contains sensitive data. Proper testing of applications in a test environment requires that the format of the data be preserved. The format-preserving encryption engine is used to encrypt the sensitive items in the production database. The encrypted versions of the sensitive data items are then exported into a test version of the database. The applications can be tested using the encrypted data in the test database.

A plaintext string may include multiple plaintext parts. Each plaintext part may have a different sensitivity level. In this type of situation, it may be desirable to provide access to different parts of the plaintext to different applications or entities. By selectively encrypting each plaintext part, access can be controlled. Encryption keys for encrypting each part can be formed using the results of earlier encryption operations. In this way, a second plaintext part may be randomized relative to a first plaintext part during encryption, a third plaintext part may be randomized relative to the second plaintext part during encryption, etc.

Further features of the invention, its nature and various advantages will be more apparent from the accompanying drawings and the following detailed description of the preferred embodiments.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a diagram of an illustrative system environment in which cryptographic tools with format-preserving encryption and decryption features may be used in accordance with an embodiment of the present invention.

FIG. 2 is a diagram showing how encryption and decryption engines preserve the format of a string in accordance with an embodiment of the present invention.

FIG. 3 is a diagram of an illustrative format-preserving block cipher that may be used during data encryption and decryption in accordance with an embodiment of the present invention.

FIG. 4 is a flow chart of illustrative steps that may be used in setting up format-preserving encryption and decryption engines for use in a data processing system of the type shown in FIG. 1 in accordance with an embodiment of the present invention.

FIG. 5 is a flow chart of illustrative steps involved in using a format-preserving encryption engine to encrypt a data string in accordance with an embodiment of the present invention.

FIG. 6 is a flow chart of illustrative steps involved in using a format-preserving decryption engine to decrypt a data string in accordance with an embodiment of the present invention.

FIG. 7A is a flow chart of illustrative steps involved in generating a key that is based on an identifier in accordance with an embodiment of the present invention.

FIG. 7B is a flow chart of illustrative steps involved in generating a key and storing the generated key with an association between the stored key and an identifier in accordance with an embodiment of the present invention.

FIG. 8 is a flow chart of illustrative steps involved in requesting and obtaining a key from a key server in accordance with an embodiment of the present invention.

FIG. 9 is a flow chart of illustrative steps involved in requesting and obtaining a key from a key server in accordance with another embodiment of the present invention.

FIG. 10 is a flow chart of illustrative steps involved in requesting and obtaining a key from a key server in accordance with yet another embodiment of the present invention.

FIG. 11 is a diagram showing how validity period information can be embedded within a credit card number during format-preserving encryption operations in accordance with an embodiment of the present invention.

FIG. 12 is a diagram of an illustrative system in which format-preserving encryption and decryption operations are performed in accordance with an embodiment of the present invention.

FIG. 13 is a flow chart of illustrative steps involved with encrypting and decrypting a credit card number using format-preserving cryptographic techniques in which validity period information is embedded in the checksum digit of the credit card number in accordance with an embodiment of the present invention.

FIG. 14 is a diagram showing how a format-preserving encryption engine may be used to encrypt data before the data is exported from a production database in a production environment to a test database in a test environment in accordance with an embodiment of the present invention.

FIG. 15 is a diagram showing how different parts of a data item such as a credit card number can be divided into different plaintext parts for selective encryption in accordance with an embodiment of the present invention.

FIG. 16 is a diagram showing how three plaintext parts of a data string can be encrypted using four cryptographic keys in accordance with an embodiment of the present invention.

FIG. 17 is a diagram showing how three plaintext parts of a data string can be encrypted using three cryptographic keys in accordance with an embodiment of the present invention.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

An illustrative cryptographic system 10 in accordance with the present invention is shown in FIG. 1. System 10 includes computing equipment 12 and communications network 14. The computing equipment 12 may include one or more personal computers, workstations, computers configured as servers, mainframe computers, portable computers, etc. The communications network 14 may be a local area network or a wide area network such as the internet. System 10 may be used in processing data for one or more organizations.

Computing equipment 12 may be used to support applications 16 and databases 18. In computing equipment 12 in which multiple applications run on the same computer platform, applications and databases may communicate with each other directly. If desired, applications 16 can communicate with each other and with databases 18 remotely using communications network 14. For example, an application 16 that is run on a computer in one country may access a database 18 that is located in another country or an application 16 running on one computer may use network 14 to transmit data to an application 16 that is running on another computer. Applications 16 may be any suitable applications, such as financial services applications, governmental record management applications, etc.

The data that is handled by system 10 includes sensitive items such as individuals' addresses, social security numbers and other identification numbers, license plate numbers, passport numbers, financial account numbers such as credit card and bank account numbers, telephone numbers, email addresses, etc. In some contexts, information such as individuals' names may be considered sensitive.

In a typical scenario, a credit card company maintains a database 18 of account holders. The database lists each account holder's name, address, credit card number, and other account information. Representatives of the credit card company may be located in many different geographic locations. The representatives may use various applications 16 to access the database. For example, a sales associate may retrieve telephone numbers of account holders to make sales calls using one application, whereas a customer service representative may retrieve account balance information using another application. Automated applications such as error-checking housekeeping applications may also require access to the database.

To prevent unauthorized access to sensitive data and to comply with data privacy regulations and other restrictions, sensitive data may need to be encrypted. Encryption operations may be performed before data is passed between applications 16 or before data is stored in a database 18. Because various applications may need to access different types of data, the system 10 preferably allows data to be selectively encrypted. As an example, each of the telephone numbers and each of the credit card numbers can be individually encrypted using separate cryptographic keys. With this type of selective encryption arrangement, applications that require access to telephone numbers need not be provided with access to credit card numbers and vice versa.

To support encryption and decryption operations in system 10, applications 16 may be provided with encryption and decryption engines. For example, an application 16 that accesses a database 18 over a communications network 14 may have an encryption engine for encrypting sensitive data before it is provided to the database 18 and stored and may have a decryption engine for use in decrypting encrypted data that has been retrieved from database 18 over communications network 14. As another example, a first application may have an encryption engine for encrypting sensitive data before passing the encrypted data to a second application. The second application may have a decryption engine for decrypting the encrypted data that has been received from the first application.

Any suitable technique may be used to provide applications 16 with encryption and decryption capabilities. For example, the encryption and decryption engines may be incorporated into the software code of the applications 16, may be provided as stand-alone applications that are invoked from within a calling application, or may be implemented using a distributed arrangement in which engine components are distributed across multiple applications and/or locations.

Key server 20 may be used to generate and store cryptographic keys that are used by the encryption and decryption engines. Key server 20 may include policy information 22 that key server 20 uses in determining whether to fulfill key requests. As an example, policy information 22 may include a set of policy rules that dictate that keys should only be released if they have not expired and if the key requester's authentication credentials are valid.
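As an illustration only, the example policy rule above (release a key only if it has not expired and the requester's credentials are valid) can be sketched in Python. The function name may_release_key, its parameters, and the choice to treat the expiry date as inclusive are assumptions made for this sketch and are not taken from the description.

```python
from datetime import date


def may_release_key(credentials_valid: bool, key_expiry: date, today: date) -> bool:
    """Hypothetical policy check: both conditions must hold to release a key.

    Assumption: a key is still usable on its expiry date (inclusive bound).
    """
    not_expired = today <= key_expiry
    return credentials_valid and not_expired
```

In such a sketch, the credentials_valid flag would come from the authentication step and key_expiry from the validity period associated with the requested key.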

In a typical scenario, an application requests a key from key server 20. When requesting the key, the application provides authentication credentials to the key server 20. The key server 20 provides the authentication credentials to authentication server 24. Authentication server 24 verifies the authentication credentials and provides the results of the verification operation to the key server over communications network 14. If the key requester is successfully authenticated and if the key server determines that the requested key has not yet expired, the key server can satisfy the key request by providing the requested key to the application over a secure path in network 14 (e.g., over a secure sockets layer link). Other authentication techniques and key request arrangements may be used if desired.

The data handled by the applications 16 and databases 18 of system 10 is represented digitally. The data includes strings of characters (e.g., names, addresses, account numbers, etc.). As shown in FIG. 2, during encryption operations, an encryption engine 26 encrypts unencrypted strings of characters (sometimes referred to as plaintext) into encrypted strings of characters (sometimes referred to as ciphertext). During decryption operations, a decryption engine 28 decrypts encrypted strings of characters to form unencrypted strings of characters.

The data strings that are handled in a typical data processing system have defined formats. For example, an identification number may be made up of three letters followed by ten digits. The encryption and decryption engines of the present invention are able to encrypt and decrypt strings without changing a string's format (i.e., so that a plaintext identification number made up of three letters followed by ten digits would be encrypted to form corresponding ciphertext made up of three letters and ten digits). The ability to preserve the format of a data string greatly simplifies system operations and allows systems with legacy applications to be provided with cryptographic capabilities that would not be possible using conventional techniques.

Conventional encryption algorithms can alter the format of a string during encryption, so that it becomes difficult or impossible to use the encrypted version of the string. For example, it may be impossible to store a conventionally-encrypted credit card number in a database table that has been designed to handle strings that contain only digits.

In accordance with the present invention, data strings can be encrypted and decrypted while preserving the format of the strings. Consider, as an example, the encryption and decryption of credit card numbers. Credit card numbers generally have between 13 and 18 digits. The format for a particular valid credit card number might require that the credit card number have 16 digits. This type of credit card number will be described as an example.

In a 16-digit credit card number, the digits are typically organized in four groups of four each, separated by three spaces. During a format-preserving encryption operation, an unencrypted credit card number such as “4408 0412 3456 7890” may be transformed into credit-card-formatted ciphertext such as “4417 1234 5678 9114”, and during decryption, the ciphertext “4417 1234 5678 9114” may be transformed back into the unencrypted credit card number “4408 0412 3456 7890”.

The value of a valid sixteenth digit in a credit card number is formed by performing a checksum operation on the first 15 digits using the so-called Luhn algorithm. Any single-digit error in the credit card number and most adjacent digit transpositions in the credit card number will alter the checksum value, so that data entry errors can be identified.
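For reference, the Luhn checksum computation mentioned above can be sketched as follows. This is the standard Luhn procedure; the function name luhn_check_digit is chosen for illustration and does not appear in the description.

```python
def luhn_check_digit(payload: str) -> int:
    """Return the Luhn check digit for a digit string that lacks its check digit."""
    total = 0
    # Walk the payload from right to left. The rightmost payload digit is
    # doubled, the next is left alone, and so on in alternation.
    for position, char in enumerate(reversed(payload)):
        digit = int(char)
        if position % 2 == 0:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as adding the two digits of the product
        total += digit
    # The check digit brings the overall sum up to a multiple of ten.
    return (10 - total % 10) % 10
```

For a 16-digit card number, payload would be the first 15 digits and the returned value would be the expected sixteenth digit.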

During encryption operations, the encryption engine 26 can compute a new checksum value using the first 15 digits of the ciphertext. The new checksum digit can be used in the ciphertext or, if desired, policy information such as a validity period may be embedded within the checksum digit by adding an appropriate validity period index value to the new checksum value. When a validity period is embedded within a checksum digit, the resulting modified checksum value will generally no longer represent a valid checksum for the string. However, applications in system 10 will be able to retrieve the validity period information from the checksum digit and will be able to use the extracted validity period information in obtaining a decryption key from key server 20 (FIG. 1).

This type of embedding operation may be used to store any suitable information within encrypted data. The use of credit card numbers, and, more particularly, the use of validity period information that has been embedded within the checksum digits of credit card numbers are described herein as examples.

Because encryption and decryption engines 26 and 28 of FIG. 2 can preserve a desired format for a string during encryption and decryption operations, sensitive data can be secured without requiring entire files to be encrypted.

The encryption and decryption engines 26 and 28 preferably use index mappings to relate possible character values in a given string position to corresponding index values in an index. By mapping string characters to and from a corresponding index, the encryption and decryption engines 26 and 28 are able to perform encryption and decryption while preserving string formatting.

In a typical scenario, an index mapping may be formed using a table having two columns and a number of rows. The first column of the mapping corresponds to the potential character values in a given string position (i.e., the range of legal values for characters in that position). The second column of the mapping corresponds to an associated index. Each row in the mapping defines an association between a character value and a corresponding index value.

Consider, as an example, a situation in which the string being encrypted has first, fifth, sixth, and seventh string characters that are digits and second, third, and fourth characters that are uppercase letters. In this situation, the possible character values in the first, fifth, sixth, and seventh character positions within the plaintext version of the string might range from 0 to 9 (i.e., the first character in the string may be any digit from 0 through 9, the fifth character in the string may be any digit from 0 to 9, etc.). The possible character values in the second, third, and fourth positions in the string range from A to Z (i.e., the second character in the unencrypted version of the string may be any uppercase letter in the alphabet from A to Z, the third character in the unencrypted version of the string may be any uppercase letter from A through Z, etc.).

The index mapping in this type of situation may map the ten possible digit values for the first, fifth, sixth, and seventh string characters into ten corresponding index values (0 . . . 9). For the second, third, and fourth character positions, 26 possible uppercase letter values (A . . . Z) may be mapped to 26 corresponding index values (0 . . . 25).
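The seven-character example above (a digit, three uppercase letters, then three digits) might be represented with per-position lookup tables along the following lines. This is only a sketch; the names DIGIT_MAP, UPPER_MAP, POSITION_MAPS, to_indices, and from_indices are hypothetical.

```python
import string

# Two-column mappings: character value -> index value.
DIGIT_MAP = {ch: i for i, ch in enumerate(string.digits)}           # '0'..'9' -> 0..9
UPPER_MAP = {ch: i for i, ch in enumerate(string.ascii_uppercase)}  # 'A'..'Z' -> 0..25

# One mapping per character position of the seven-character example format.
POSITION_MAPS = [DIGIT_MAP, UPPER_MAP, UPPER_MAP, UPPER_MAP,
                 DIGIT_MAP, DIGIT_MAP, DIGIT_MAP]


def to_indices(text: str) -> list:
    """Map each character to its index value; a KeyError signals an illegal character."""
    if len(text) != len(POSITION_MAPS):
        raise ValueError("string does not match the expected format length")
    return [mapping[ch] for mapping, ch in zip(POSITION_MAPS, text)]


def from_indices(indices: list) -> str:
    """Map index values back to characters using the inverse of each mapping."""
    chars = []
    for mapping, index in zip(POSITION_MAPS, indices):
        inverse = {value: ch for ch, value in mapping.items()}
        chars.append(inverse[index])
    return "".join(chars)
```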

In a typical string, not all characters have the same range of potential character values. If there are two ranges of potential character values, two index mappings may be used, each of which maps a different set of possible character values to a different set of index values. If there are three ranges of potential character values within the string, three index mappings may be used. For example, a first index mapping may relate a digit character to a first index, a second index mapping may relate an uppercase letter character to a second index, and a third index mapping may relate an alphanumeric character to a third index. In strings that contain a larger number of different character types, more index mappings may be used.

In general, a string contains a number of characters N. The potential character values in the string are related to corresponding index values using index mappings. An index mapping is created for each character. The indexes used to represent each character may have any suitable size. For example, an index containing 52 index values may be associated with string characters with character values that span both the uppercase and lowercase letters. Because not all of the characters typically have the same range of potential character values, there are generally at least two different index mappings used to map character values in the string to corresponding index values. In a string with N characters, N index mappings are used, up to N of which may be different index mappings.

Any suitable cryptographic formulation may be used for the format-preserving encryption and decryption engines 26 and 28, provided that the cryptographic strength of the encryption algorithm is sufficiently strong. With one suitable approach, encryption engine 26 and decryption engine 28 use a cryptographic algorithm based on the well known Luby-Rackoff construction. The Luby-Rackoff construction is a method of using pseudo-random functions to produce a pseudo-random permutation (also sometimes referred to as a block cipher). A diagram showing how encryption engine 26 and decryption engine 28 may be implemented using the Luby-Rackoff construction is shown in FIG. 3.

During encryption operations, an unencrypted string is divided into two portions. The unencrypted string may be divided into two portions using any suitable scheme. For example, the string may be divided into odd and even portions by selecting alternating characters from the string for the odd portion and for the even portion. With another suitable approach, the unencrypted string is divided into two portions by splitting the string into left and right halves.
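The two splitting schemes just described can be sketched as follows. The function names are illustrative, and the convention that the first character belongs to the "odd" portion (counting positions from one) is an assumption.

```python
def split_halves(text: str) -> tuple:
    """Split into left and right halves; the right half is longer for odd lengths."""
    middle = len(text) // 2
    return text[:middle], text[middle:]


def split_alternating(text: str) -> tuple:
    """Split into odd and even portions by taking alternating characters."""
    return text[0::2], text[1::2]


def join_alternating(odd: str, even: str) -> str:
    """Interleave the portions produced by split_alternating to rebuild the string."""
    chars = []
    for i in range(len(odd) + len(even)):
        chars.append(odd[i // 2] if i % 2 == 0 else even[i // 2])
    return "".join(chars)
```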

In FIG. 3, the first half of the unencrypted string is labeled “L₁” and the second half of the unencrypted string is labeled “R₁”. During encryption operations with encryption engine 26, the unencrypted string halves L₁ and R₁ are processed to form corresponding encrypted string halves L₃ and R₂. During decryption operations with decryption engine 28, processing flows from the bottom of FIG. 3 towards the top, so that encrypted string halves L₃ and R₂ are decrypted to produce unencrypted halves L₁ and R₁. Processing occurs in three rounds 40, 42, and 44. During encryption, the operations of round 40 are performed first, the operations of round 42 are performed second, and the operations of round 44 are performed third. During decryption, the operations of round 44 are performed first, the operations of round 42 are performed second, and the operations of round 40 are performed third.

Although shown as involving three rounds in the example of FIG. 3, the operations of FIG. 3 may, if desired, be implemented using four or more rounds. The use of a three-round block cipher is described as an example.

The block cipher structure of FIG. 3 encrypts (or decrypts) a string of a particular known size to produce an output string of the same size. The block cipher uses a subkey generation algorithm 38. The subkey generation algorithm 38 has three inputs: a key K, a constant C (C₁ for round 40, C₂ for round 42, and C₃ for round 44), and a string S (S₁=R₁ for round 40, S₂=L₂ for round 42, and S₃=R₂ for round 44).

The subkey generation algorithm 38 may be a function H′ that is based on a cryptographic hash function H and that takes as an input S, C, and K. With one suitable approach, the subkey generation algorithm H′ is given by equation 1.

H′=H(S|C|K)  (1)

In equation 1, the symbol “|” represents the concatenation function. The cryptographic hash function H is preferably chosen so that the subkey generation algorithm has a suitable cryptographic strength. Illustrative cryptographic hash functions that can be used for hash function H include the SHA1 hash function and the AES algorithm used as a hash function.

The value of the key K is the same for rounds 40, 42, and 44. The value of the constant C is different for each round. With one suitable arrangement, the constant C₁ that is used in round 40 is equal to 1, the constant C₂ that is used in round 42 is 2, and the constant C₃ that is used in round 44 is 3. The value of S varies in each round. In round 40, S₁ is equal to the second half of the unencrypted string, R₁. In round 42, S₂ is equal to L₂. In round 44, S₃ is equal to R₂.

In round 40, the output of the subkey generation algorithm is subkey SK1, as shown in equation 2.

SK1=H(S₁|C₁|K)  (2)

In round 42, the output of the subkey generation algorithm is subkey SK2, as shown in equation 3.

SK2=H(S₂|C₂|K)  (3)

In round 44, the output of the subkey generation algorithm is subkey SK3, as shown in equation 4.

SK3=H(S₃|C₃|K)  (4)
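A minimal sketch of the hash-based subkey generation of equations 1-4, using SHA1 as the hash function H, is shown below. The byte-level details are assumptions made for illustration: the string S is encoded as ASCII, the constant C is encoded as a single byte, and the digest is interpreted as a big-endian integer. The function name subkey is hypothetical.

```python
import hashlib


def subkey(s: str, c: int, key: bytes) -> int:
    """Compute H(S|C|K) with SHA1 and return the digest as an integer.

    Assumptions: S is ASCII text, C fits in one byte (0..255), and the
    20-byte digest is read as a big-endian integer.
    """
    data = s.encode("ascii") + bytes([c]) + key
    digest = hashlib.sha1(data).digest()
    return int.from_bytes(digest, "big")
```

Returning an integer is convenient when the subkey is later combined with a string portion using modular arithmetic.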

Equations 1-4 involve the use of a cryptographic hash function for the subkey generation algorithm. If desired, the subkey generation algorithm may be implemented using a cryptographic message authentication code (MAC) function. A cryptographic message authentication code function is a keyed hash function. Using a cryptographic message authentication code function, equation 1 would become H′=MACF(S|C|K), where MACF is the message authentication code function. An example of a message authentication code function is CMAC (cipher-based MAC), which is a block-cipher-based message authentication code function. The cryptographic message authentication code function AES-CMAC is a CMAC function based on the 128-bit advanced encryption standard (AES).
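The keyed variant can be sketched in the same way. Note that the sketch below substitutes HMAC with SHA1 for the CMAC function named above, solely because HMAC is available in the Python standard library; it is a stand-in for illustration and not the AES-CMAC function itself. Passing the key K as the MAC key and S|C as the message, the ASCII and single-byte encodings, and the function name subkey_mac are all assumptions.

```python
import hashlib
import hmac


def subkey_mac(s: str, c: int, key: bytes) -> int:
    """Keyed subkey generation using HMAC-SHA1 as a stand-in MAC function.

    Assumptions: the message is S (ASCII) followed by C as one byte, and
    K is supplied as the MAC key rather than concatenated into the message.
    """
    message = s.encode("ascii") + bytes([c])
    tag = hmac.new(key, message, hashlib.sha1).digest()
    return int.from_bytes(tag, "big")
```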

A format-preserving combining operation (labeled “+” in FIG. 3) is used to combine the subkeys SK1, SK2, and SK3 with respective string portions.

During encryption operations, format-preserving combining operation 46 combines SK1 with string L₁ to produce string L₂. During decryption operations, format-preserving combining operation 46 combines SK1 with string L₂ to produce string L₁. Format-preserving combining operation 48 combines SK2 with string R₁ to produce string R₂ during encryption operations and combines SK2 with string R₂ to produce string R₁ during decryption operations. Format-preserving combining operation 50 is used to process subkey SK3. During encryption, format-preserving combining operation 50 combines SK3 with string L₂ to produce string L₃. During decryption, format-preserving combining operation 50 combines SK3 with string L₃ to produce string L₂.
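The three rounds can be put together in a short sketch for digit-only strings. Everything below is illustrative and rests on stated assumptions: SHA1 serves as H, S is encoded as ASCII and C as a single byte, the string is split into left and right halves, and the combining operation is addition modulo a power of ten sized to the portion being combined. The names subkey, add_mod, encrypt, and decrypt are hypothetical. A three-round toy of this kind is not presented as a vetted cipher.

```python
import hashlib


def subkey(s: str, c: int, key: bytes) -> int:
    """H(S|C|K) with SHA1, returned as an integer (encoding is an assumption)."""
    data = s.encode("ascii") + bytes([c]) + key
    return int.from_bytes(hashlib.sha1(data).digest(), "big")


def add_mod(portion: str, value: int) -> str:
    """Add value to a digit string modulo 10**len(portion), keeping its length."""
    modulus = 10 ** len(portion)
    return str((int(portion) + value) % modulus).zfill(len(portion))


def encrypt(plaintext: str, key: bytes) -> str:
    """Rounds 40, 42, 44 in order. Expects at least two digits."""
    middle = len(plaintext) // 2
    l1, r1 = plaintext[:middle], plaintext[middle:]
    l2 = add_mod(l1, subkey(r1, 1, key))  # round 40: SK1 from R1, applied to L1
    r2 = add_mod(r1, subkey(l2, 2, key))  # round 42: SK2 from L2, applied to R1
    l3 = add_mod(l2, subkey(r2, 3, key))  # round 44: SK3 from R2, applied to L2
    return l3 + r2


def decrypt(ciphertext: str, key: bytes) -> str:
    """Rounds 44, 42, 40 in order, subtracting each subkey instead of adding."""
    middle = len(ciphertext) // 2
    l3, r2 = ciphertext[:middle], ciphertext[middle:]
    l2 = add_mod(l3, -subkey(r2, 3, key))
    r1 = add_mod(r2, -subkey(l2, 2, key))
    l1 = add_mod(l2, -subkey(r1, 1, key))
    return l1 + r1
```

Because each subkey depends only on a half that is still available during decryption, the rounds can be undone in reverse order, and the output always has the same number of digits as the input.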

The format-preserving combining operation + preserves the format of the strings L₁, L₂, L₃, R₁, and R₂ as they are combined with the subkeys SK1, SK2, and SK3. For example, the string L₂ that is produced by combining string L₁ and subkey SK1 has the same format as the string L₁.

The format-preserving combining operation + may be based on any suitable mathematical combining operation. For example, the function + may be addition mod x or the function + may be multiplication mod x, where x is an integer of an appropriate size (i.e., x=y^z, where z is equal to the length of the string S, and where y is equal to the number of possible character values for each character in the string S). If, as an example, the string S contains 16 digits (each digit having one of 10 possible values from 0 to 9), x would be 10¹⁶. If the string S contains three uppercase letters (each uppercase letter having one of 26 possible values from A to Z), x would be 26³. These are merely illustrative examples. The format-preserving combining function + may be any reversible logical or arithmetic operation that preserves the format of its string input when combined with the subkey.
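As a sketch of the addition mod x option, the function below treats a string portion as a base-y number over a given alphabet, adds (or, for decryption, subtracts) the subkey modulo x=y^z, and converts the result back to a string of the same length and character set. The function name combine and its parameters are assumptions for illustration; only the addition variant is shown.

```python
def combine(portion: str, subkey: int, alphabet: str, decrypt: bool = False) -> str:
    """Add (or subtract) subkey modulo x = y**z and return a same-format string."""
    y = len(alphabet)   # number of possible values per character
    z = len(portion)    # length of the string portion
    x = y ** z

    # Interpret the portion as a base-y integer using each character's index.
    value = 0
    for ch in portion:
        value = value * y + alphabet.index(ch)

    value = (value - subkey) % x if decrypt else (value + subkey) % x

    # Convert back to exactly z characters, least significant position first.
    chars = []
    for _ in range(z):
        value, remainder = divmod(value, y)
        chars.append(alphabet[remainder])
    return "".join(reversed(chars))
```

With alphabet "0123456789" and a 16-character portion, x is 10¹⁶, and with the 26 uppercase letters and a three-character portion, x is 26³, matching the two examples in the text.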

Illustrative steps involved in setting up the encryption engine 26 and decryption engine 28 are shown in FIG. 4. At step 52, the desired formatting to be used for the encrypted and decrypted strings is defined.

For example, unencrypted strings may be social security numbers that follow the format ddd-dd-dddd, where d is a digit from 0 to 9. The encryption engine 26 may produce corresponding encrypted strings with the identical format.

As another example, the string may be a credit card number with the format dddd dddd dddd dddc, where d is a digit from 0 to 9 and where c is a checksum digit (a digit from 0 to 9). The block cipher may be applied to the leading 15 digits of the credit card number and a checksum value may be recomputed from the encrypted version of the leading 15 digits using the Luhn algorithm. Validity period information may be embedded into the checksum digit by adding a validity period index to the recomputed checksum value. The index may, as an example, specify that an index value of 1 corresponds to the year 2006, an index value of 2 corresponds to the year 2007, an index value of 3 corresponds to the year 2008, etc. If the recomputed checksum is 3 (as an example), and the validity period for the encryption operation is 2006, the index value of 1 (corresponding to year 2006) may be added to the checksum value of 3 to produce a checksum digit of 4 for the ciphertext. In this situation, the final version of the encrypted string has the form dddd dddd dddd dddc, where the value of c is 4. The overall encryption process implemented by the encryption engine 26 maintains the digit format of the string, because both the unencrypted and encrypted versions of the string contain 16 digits.
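The checksum handling in this example can be sketched as follows. The Luhn computation is the standard one. Reducing the sum of checksum and index mod 10 is an assumption made here so that the result remains a single digit, and the function names are illustrative.

```python
def luhn_check_digit(payload: str) -> int:
    """Compute the Luhn check digit for a string of payload digits."""
    total = 0
    # Walk from the rightmost payload digit, doubling every second digit
    # starting with the rightmost one.
    for i, ch in enumerate(reversed(payload)):
        d = int(ch)
        if i % 2 == 0:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return (10 - total % 10) % 10


def embed_validity_index(encrypted15: str, index: int) -> str:
    """Append a checksum digit that carries a validity period index.

    Assumption: the sum wraps mod 10 to stay within one digit.
    """
    c = (luhn_check_digit(encrypted15) + index) % 10
    return encrypted15 + str(c)


# Stand-in for the encrypted leading 15 digits (not real ciphertext).
ciphertext16 = embed_validity_index("411111111111111", 1)
```

With a recomputed checksum of 3 and an index of 1, this produces the checksum digit 4 described above.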

The inclusion of additional constraints on the format of the encrypted string may be necessary to ensure that the encrypted strings are fully compliant with legacy applications. During step 52, a user decides which of these ancillary constraints are to be included in the definition of the required format for the string.

At step 54, for each character in the string, an index mapping is created by defining a set of legal character values and a corresponding index of sequential values that is associated with the legal character values. For example, if the legal characters for a particular character position in a string include the 10 digits (0 . . . 9) and the 26 lowercase letters (a . . . z), a suitable indexing scheme associates digits 0 through 9 with index values 1 through 10 and associates letters a through z with index values 11 through 36. In this index mapping, the index values that are created are all adjacent. Because there are no gaps in the indices, index value 10 is adjacent to index value 11 (in the present example). If the string contains more than one type of character, there will be more than one index mapping associated with the characters in the string.
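The index mapping in this example can be sketched with a pair of dictionaries. The variable names are illustrative.

```python
import string

# Legal characters for this position: 10 digits followed by 26 lowercase letters.
legal_characters = string.digits + string.ascii_lowercase

# Sequential index values with no gaps: '0'..'9' -> 1..10, 'a'..'z' -> 11..36.
char_to_index = {ch: i for i, ch in enumerate(legal_characters, start=1)}

# The reverse mapping is used when decoding index values back into characters.
index_to_char = {i: ch for ch, i in char_to_index.items()}
```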

At step 56, a value for key K is obtained. The value of K may be obtained, for example, by generating K from a root secret and other information using a key generation algorithm in key server 20.

At step 58, the format-preserving combining operation “+” is defined. As described in connection with FIG. 3, the format-preserving combining operation may be addition modulo x, multiplication modulo x, or any other suitable logical or arithmetic operation that preserves the format of the string when combining the string with a subkey and that is reversible.

At step 60, a block cipher structure is selected for the encryption engine 26 and decryption engine 28. The block cipher structure may, for example, be a Luby-Rackoff construction of the type described in connection with FIG. 3. Other suitable block cipher structures may be used if desired.

At step 62, a subkey generation algorithm is selected. Suitable subkey generation algorithms include those based on cryptographic hash functions such as the SHA1 hash function and the AES algorithm used as a hash function. Suitable subkey generation algorithms also include those built on cryptographic message authentication code functions such as AES-CMAC.

After performing the setup steps of FIG. 4, the encryption engine 26 and decryption engine 28 can be implemented in system 10 and sensitive data can be secured.

Illustrative steps involved in using the encryption engine 26 and decryption engine 28 when processing strings of data in system 10 are shown in FIGS. 5 and 6. As described in connection with FIGS. 1 and 2, the encryption engine 26 and decryption engine 28 may be called by an application or may be part of an application 16 that is running on data processing system 10. The data strings that are encrypted and decrypted may be strings that are retrieved from and stored in fields in a database 18 or may be strings that are passed between applications 16 (e.g., applications 16 that are running on the same computing equipment 12 or that are communicating remotely over a communications network 14).

The flow chart of FIG. 5 shows steps involved in encrypting a data string.

As shown in FIG. 5, the data string is preprocessed at step 64, encrypted at step 72, and postprocessed at step 74.

At step 66, the encryption engine obtains the unencrypted string. The string may be retrieved from a database 18 or received from an application 16.

At step 68, the string is processed to identify relevant characters. During step 68, dashes, spaces, checksums, and other undesired characters can be removed from the string and the relevant characters in the string can be retained.

For example, if the string is a social security number that contains nine digits separated by two dashes, the string can be processed to remove the dashes. Although the dashes could be left in the string, there is no purpose in encrypting a dash character in the unencrypted string to produce a corresponding dash character in the encrypted string (as would be required to preserve the format of the entire string).

As another example, if the string being processed is a credit card number containing 16 digits and three spaces, the spaces can be removed. The checksum portion of the 16-digit credit card number can be ignored by extracting the 15 leading digits of the credit card number as the relevant characters to be processed further.

At step 70, the encryption engine 26 uses the index mappings that were created during step 54 of FIG. 4 to convert the processed string (i.e., the string from which the irrelevant characters have been removed) into an encoded unencrypted string. For example, consider a license plate number in which the first, fifth, sixth, and seventh character positions contain digits (i.e., numbers from 0 through 9) and the second, third, and fourth character positions contain uppercase letters. An index mapping may be used to convert the character values in the first, fifth, sixth, and seventh character positions into corresponding index values ranging from 0 through 9. Another index mapping may be used to convert the character values in the second, third, and fourth character positions into corresponding index values ranging from 0 through 25. The index values used in each index mapping may be sequential. Once the characters have been encoded using the sequential index values, processing can continue at step 72.
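The license plate example can be sketched with one alphabet per character position. The sample plate value and the names used here are illustrative.

```python
import string

DIGITS = string.digits            # index values 0 through 9
UPPER = string.ascii_uppercase    # index values 0 through 25

# One index mapping per character position: digit, three letters, three digits.
PLATE_ALPHABETS = [DIGITS, UPPER, UPPER, UPPER, DIGITS, DIGITS, DIGITS]


def encode(plate: str) -> list:
    """Convert each character to its index value; illegal characters raise ValueError."""
    if len(plate) != len(PLATE_ALPHABETS):
        raise ValueError("unexpected plate length")
    return [alphabet.index(ch) for ch, alphabet in zip(plate, PLATE_ALPHABETS)]


def decode(indexes: list) -> str:
    """Convert index values back to characters using the same mappings."""
    return "".join(alphabet[i] for i, alphabet in zip(indexes, PLATE_ALPHABETS))


encoded = encode("1ABC234")
```

The same `decode` function covers the reverse conversion that is performed after encryption or decryption.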

At step 72, the encryption engine 26 encrypts the encoded string using the format-preserving block cipher that was established during the operations of FIG. 4. For example, the encryption engine 26 can perform the Luby-Rackoff encryption operations described in connection with FIG. 3. During step 72, the subkey generation algorithm that was selected at step 62 of FIG. 4 and the format-preserving combining algorithm + that was defined at step 58 of FIG. 4 are used to transform the unencrypted encoded string into an encrypted encoded string.
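A rough sketch of a three-round construction of this kind on a string of decimal digits is shown below. The order of the combining operations follows the description of operations 46, 48, and 50. The way each subkey is derived here (a SHA1 hash over the key, a round number, and the opposite half, reduced mod x) is an assumption made for illustration, as are all function names. The sketch is not intended as a secure implementation.

```python
import hashlib


def subkey(key: bytes, round_no: int, other_half: str, x: int) -> int:
    """Derive a round subkey (assumed derivation: SHA1 of key, round, other half)."""
    data = key + bytes([round_no]) + other_half.encode("ascii")
    return int.from_bytes(hashlib.sha1(data).digest(), "big") % x


def encrypt(digits: str, key: bytes) -> str:
    nl = len(digits) // 2                 # length of left half
    nr = len(digits) - nl                 # length of right half
    xl, xr = 10 ** nl, 10 ** nr           # moduli for each half
    l1, r1 = int(digits[:nl]), int(digits[nl:])
    l2 = (l1 + subkey(key, 1, str(r1).zfill(nr), xl)) % xl   # operation 46
    r2 = (r1 + subkey(key, 2, str(l2).zfill(nl), xr)) % xr   # operation 48
    l3 = (l2 + subkey(key, 3, str(r2).zfill(nr), xl)) % xl   # operation 50
    return str(l3).zfill(nl) + str(r2).zfill(nr)


def decrypt(digits: str, key: bytes) -> str:
    nl = len(digits) // 2
    nr = len(digits) - nl
    xl, xr = 10 ** nl, 10 ** nr
    l3, r2 = int(digits[:nl]), int(digits[nl:])
    # Undo the rounds in reverse order using subtraction mod x.
    l2 = (l3 - subkey(key, 3, str(r2).zfill(nr), xl)) % xl
    r1 = (r2 - subkey(key, 2, str(l2).zfill(nl), xr)) % xr
    l1 = (l2 - subkey(key, 1, str(r1).zfill(nr), xl)) % xl
    return str(l1).zfill(nl) + str(r1).zfill(nr)


key = b"example key K"                    # stand-in key value
ciphertext = encrypt("123456789", key)
plaintext = decrypt(ciphertext, key)
```

The input is assumed to contain at least two digits so that both halves are non-empty. Each half is zero-padded on output, so the ciphertext has the same number of digits as the plaintext.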

At step 76, the same index mappings that were used during the encoding operations of step 70 are used to convert the index values of the encrypted string back into characters (i.e., characters in the legal set of character values that were defined for each character position at step 54). Decoding the encoded version of the string using the index mappings returns the string to its original character set.

At step 78, the decoded encrypted string is processed to restore elements such as dashes and spaces that were removed at step 68. When replacing a checksum value, a new valid checksum value can be computed from the encrypted version of the string and validity period information or other suitable information can be embedded within the checksum digit (e.g., by adding a validity period index to the new valid checksum value to produce a checksum digit for the decoded encrypted string). The decoded encrypted string is ciphertext that corresponds to the plaintext unencrypted string that was obtained at step 66. If desired, the entire string can be encrypted. With this type of arrangement, the checksum removal operation of step 68 and the checksum digit computation operation of step 78 can be omitted.

By processing the string at step 78, the extraneous elements of the string that were removed at step 68 are inserted back into the string. Because the extraneous elements are reinserted into the string and because a format-preserving block cipher was used in step 72, the encrypted string that is produced will have the same format as the original unencrypted string. This allows the encrypted string to be used by applications 16 and databases 18 that require that the original string's format be used.
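The removal of extraneous elements at step 68 and their reinsertion at step 78 can be sketched as follows for a social security number. The template representation (a list with empty slots where relevant characters belong) is an illustrative choice, and the replacement digits are a stand-in for real ciphertext.

```python
DIGITS = "0123456789"


def strip_extraneous(s: str, relevant: str = DIGITS):
    """Return the relevant characters plus a template recording what was removed."""
    kept = "".join(c for c in s if c in relevant)
    # None marks a slot for a relevant character; anything else is kept literally.
    template = [None if c in relevant else c for c in s]
    return kept, template


def restore_extraneous(chars: str, template: list) -> str:
    """Reinsert dashes, spaces, and similar elements at their original positions."""
    it = iter(chars)
    return "".join(next(it) if slot is None else slot for slot in template)


relevant, template = strip_extraneous("123-45-6789")
# Stand-in for the output of the format-preserving cipher on `relevant`.
encrypted_digits = "987654321"
formatted = restore_extraneous(encrypted_digits, template)
```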

At step 80, the encrypted string is provided to an application 16 or database 18. Legacy applications and databases that require a specific string format may be able to accept the encrypted string.

Illustrative steps involved in using decryption engine 28 to decrypt a string that has been encrypted using the process of FIG. 5 are shown in FIG. 6. The decryption engine 28 may be invoked by an application 16 or may be part of an application 16 that is running on data processing system 10. The data string that is being decrypted in the process of FIG. 6 may be an encrypted string that has been retrieved from a database 18 or may be a string that has been received from an application.

As shown in FIG. 6, the encrypted data string is preprocessed at step 82, decrypted at step 90, and postprocessed at step 92.

At step 84, the decryption engine obtains the encrypted string. The encrypted string may be retrieved from a database 18 or received from an application 16.

At step 86, the encrypted string is processed to identify relevant characters. During step 86, dashes, spaces, checksums, and other extraneous elements can be removed from the string. The relevant characters in the string are retained. The process of removing extraneous characters during step 86 is the same as that used during the processing of the unencrypted string that was performed during step 68 of FIG. 5.

If the string being decrypted is a social security number that contains nine digits separated by two dashes, the encrypted string can be processed to remove the dashes.

As another example, if the string being processed during step 86 is a credit card number containing 16 digits and three spaces, the spaces can be removed prior to decryption. The checksum digit of the 16-digit credit card number can be ignored by extracting the 15 leading digits of the encrypted credit card number as the relevant characters to be decrypted. If information is embedded in the checksum digit (e.g., validity period information), the checksum digit may be processed to extract this information during step 86.

At step 88, the decryption engine 28 uses the index mappings that were defined at step 54 of FIG. 4 and that were used during the encryption operations of FIG. 5 to convert each of the characters of the processed encrypted string (i.e., the encrypted string from which the extraneous characters have been removed) into an encoded encrypted string. If, as an example, the legal set of characters associated with the first character of the encrypted string is defined as the set of 10 digits, an index with 10 values may be used to encode the first character of the encrypted string. If the legal set of characters associated with the second character of the encrypted string is defined as the set of 26 uppercase letters, an index with 26 values may be used to encode the second character of the encrypted string. During step 88, each character of the string is converted to a corresponding index value using an appropriate index mapping.

At step 90, the encoded version of the encrypted string is decrypted. The decryption engine 28 decrypts the string using the format-preserving block cipher that was established during the operations of FIG. 4. For example, the decryption engine 28 can perform the Luby-Rackoff decryption operations described in connection with FIG. 3. During step 90, the subkey generation algorithm that was selected at step 62 of FIG. 4 and the format-preserving combining algorithm + that was defined at step 58 of FIG. 4 are used to transform the encrypted encoded string into a decrypted encoded string.

At step 94, the index mappings that were used during the encoding operations of step 88 are used to convert the index values of the decrypted string back into their associated characters (i.e., characters in the legal set of character values that were defined for each character position at step 54). This returns the decrypted string to its original character set. In strings that contain more than one different type of character, multiple different index mappings are used.

At step 96, the decoded decrypted string is processed to restore elements such as dashes, spaces, and checksum values that were removed at step 86. When replacing a checksum value, a new valid checksum value may be computed from the decrypted version of the string. This ensures that the decrypted version of the string will be returned to its original valid state.

During the string processing operations of step 96, the extraneous elements of the string that were removed at step 86 are inserted back into the string. This restores the string to its original unencrypted state (i.e., the state of the string when obtained at step 66 of FIG. 5).

At step 98, the decrypted string is provided to an application 16 or database 18.

By incorporating format-preserving encryption and decryption engines 26 and 28 into data processing system 10, legacy applications and databases and other applications and databases can be provided with cryptographic capabilities without disrupting their normal operation.

The key K that is used by encryption and decryption engines 26 and 28 may be produced using any suitable technique. For example, key K may be supplied to key server 20 manually and may be distributed to encryption and decryption engines 26 and 28 in satisfaction of valid key requests. With one particularly suitable arrangement, key K is derived mathematically from a secret. The secret, which is sometimes referred to as a root secret, may be maintained at key server 20. The root secret may be supplied to key server 20 manually or may be produced using a pseudo-random number generator.

To ensure that keys are only distributed to authorized applications 16, it may be advantageous to mathematically compute each key K from policy information 22 (FIG. 1). As an example, key K may be computed by key server 20 using equation 5.

K=f(RSECRET,IDEN)  (5)

In equation 5, the parameter IDEN is an identifier, the parameter RSECRET is a root secret, and the function f is a one-way function such as a hash function. An example of a hash function that may be used for function f is the SHA1 hash function. If desired, other hash functions and one-way functions may be used for function f.

The identifier IDEN may include information that identifies an individual, a group, a policy, or an application. As an example, the identifier may be based on the name of an individual, the name of an organization, the name of a group, or any other suitable user name. The identifier may also be based on the name of a policy (e.g., “PCI” indicating that cryptographic operations should be performed in accordance with payment card industry standards) or may be based on the name of an application. When an application requests key K from key server 20, the key server 20 may use all or part of the value of IDEN in determining whether the key requester is authorized to receive K. If the key requester is authorized, the function of equation 5 may be used to generate K.

To support version-based functions in system 10, it may be desirable to allow identities and their associated keys K to expire. Identity and key expiration may be implemented by requiring that a validity period be included in each identity IDEN. The validity period indicates the dates on which the key K is valid. Validity periods can be expressed in terms of absolute dates, abbreviated dates, version numbers that relate to valid date ranges or key versions, etc.

One suitable format for the validity period is an expiration date. For example, a validity period for IDEN may be made up of a year of expiration (e.g., 2007), may be made up of a week of expiration (e.g., week number 45), may be made up of a month and year of expiration (e.g., 03/2007 or 03/07), etc. Validity periods may also be constructed using a date range (e.g., 2006-2007) during which key K is valid. With one suitable arrangement for use when encrypting and decrypting credit cards, the validity period in an identity IDEN may be a credit card expiration date (e.g., 05/08).

The credit card expiration date or other such information (e.g., a record locator, cardholder name, etc.) may be combined with information that labels the identity IDEN as being associated with credit cards and the payment card industry (PCI). The value of IDEN might be formed, for example, by combining the strings “Joe Smith” (the name of a holder of a credit card), “PCI” (indicating the payment card industry), and a credit card expiration date to form (as an example) a value for IDEN of “JOE_SMITH_PCI_05/08.”

Illustrative steps involved in forming a key K using equation 5 are shown in FIG. 7A.

At step 100, key server 20 obtains the parameter RSECRET (e.g., using a pseudorandom number generator operating at key server 20, by retrieving RSECRET from a cache at key server 20, etc.).

At step 102, the key server 20 obtains the parameter IDEN. The parameter IDEN may be provided to key server 20 as part of a key request (e.g., in a single transmission requesting a key or in a series of related transmissions requesting a key). Information such as a user identity (e.g., a username or part of a username, a group identity, etc.), validity period (e.g., an expiration date, a valid date range, a version number, or a combination of such validity period information), and industry/key type (e.g., “PCI” for the payment card industry) may be included in the value of the IDEN string. If desired, components of the IDEN string may be represented using multiple strings or additional information may be included in the IDEN string.

At step 104, key server 20 may use function f of equation 5 (e.g., a SHA1 hash function or other one-way function) to compute K from the known values of the root secret RSECRET and the identifier IDEN.
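The operations of FIG. 7A can be sketched as follows, using SHA1 as function f. How RSECRET and IDEN are combined before hashing is not specified here, so the simple concatenation with a separator is an assumption, as are the helper names and the way the IDEN string is assembled. A keyed construction such as HMAC could be substituted for function f.

```python
import hashlib


def make_iden(user: str, policy: str, validity: str) -> str:
    """Assemble an identifier string (hypothetical layout for illustration)."""
    return "_".join([user.upper().replace(" ", "_"), policy, validity])


def derive_key(rsecret: bytes, iden: str) -> bytes:
    """K = f(RSECRET, IDEN), with f taken to be SHA1 over the concatenated inputs."""
    return hashlib.sha1(rsecret + b"|" + iden.encode("utf-8")).digest()


rsecret = b"stand-in root secret"          # in practice held only at the key server
iden = make_iden("Joe Smith", "PCI", "05/08")
k = derive_key(rsecret, iden)              # 20-byte key
```

Because the derivation is deterministic, the same key can be regenerated on demand from RSECRET and IDEN, and a different validity period in IDEN yields a different key.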

Keys may be generated using the operations of FIG. 7A at any suitable time. For example, key server 20 may generate a key K whenever a valid key request is received. If desired, key server 20 may maintain a key cache in which previously generated keys are stored. Use of a key cache may reduce the processing burden on key server 20.

A flow chart of illustrative steps involved in generating key K using an approach in which generated key K is persistently stored is shown in FIG. 7B. With the approach of FIG. 7B, key K is generated randomly at step 105. For example, key K may be generated using a pseudorandom number generator at key server 20 when a key is requested in a key request containing an identifier IDEN.

At step 107, the key K is stored in persistent storage (e.g., a key cache maintained at key server 20). Key server 20 also stores an association between the key K that has been generated and the value of identifier IDEN from the key request. The association may be provided by making an entry in a database that contains the key and the related identifier IDEN (as an example). At a later time, when key K is requested, key server 20 can retrieve the correct key K from storage to satisfy the request using the value of the identifier IDEN that is provided in the key request (step 109). The approach of FIG. 7B therefore allows the key server 20 to obtain the key K by generating the key K randomly (if no key value has been cached) or by retrieving a previously stored version of key K using the identity value IDEN.
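The approach of FIG. 7B can be sketched as a lookup table keyed by IDEN. In this sketch an in-memory dictionary stands in for the persistent storage, the 16-byte key length is an assumption, and the class name is illustrative.

```python
import secrets


class KeyStore:
    """Generate a random key per identifier and return the same key on later requests."""

    def __init__(self) -> None:
        # Stand-in for persistent storage (e.g., a database table of IDEN -> K).
        self._keys = {}

    def get_key(self, iden: str) -> bytes:
        if iden not in self._keys:
            # Step 105: generate K randomly; step 107: store it with its IDEN.
            self._keys[iden] = secrets.token_bytes(16)
        # Step 109: retrieve the stored K using the IDEN from the key request.
        return self._keys[iden]


store = KeyStore()
first = store.get_key("JOE_SMITH_PCI_05/08")
second = store.get_key("JOE_SMITH_PCI_05/08")
```

Unlike the derivation of FIG. 7A, a randomly generated key cannot be recomputed if the stored entry is lost, which is why the association is kept in persistent storage.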

Key server 20 also preferably maintains policy information 22 (FIG. 1). Policy information 22 includes policy rules that may be used in determining which key requests should be granted and which key requests should be denied. An example of a policy rule is a rule that requires that a key requester authenticate successfully as part of a PCI LDAP (Lightweight Directory Access Protocol) group whenever the parameter IDEN includes the industry type “PCI.” As another example, a policy rule might specify that key requests should only be satisfied if made on a date that falls within the validity period specified in the IDEN parameter. Key server 20 may maintain a clock or may otherwise obtain trustworthy external information on the current date. External information such as this may be used by key server 20 in evaluating whether the policy rules have been satisfied for a particular key request. In a typical scenario, the policy rules at key server 20 will specify multiple criteria that must be satisfied (e.g., proper authentication of a given type must be performed, a validity period restriction must be satisfied, etc.).
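The two example policy rules can be sketched as a single check. The request fields, the group name, and the interpretation of a two-digit expiration date (years in the 2000s, valid through the end of the stated month) are assumptions made for illustration.

```python
from datetime import date


def within_validity_period(expiry_mm_yy: str, today: date) -> bool:
    """Return True if `today` is on or before the end of the expiration month."""
    month_text, year_text = expiry_mm_yy.split("/")
    expiry = (2000 + int(year_text), int(month_text))   # assumes years 2000-2099
    return (today.year, today.month) <= expiry


def request_allowed(industry: str, expiry_mm_yy: str, groups: set, today: date) -> bool:
    """Apply both rules: group membership for PCI keys and the validity period."""
    if industry == "PCI" and "PCI" not in groups:
        return False            # requester did not authenticate as a PCI group member
    return within_validity_period(expiry_mm_yy, today)


allowed = request_allowed("PCI", "05/08", {"PCI"}, date(2008, 5, 31))
```

Here `groups` represents group membership reported after authentication and `today` represents the trustworthy current date available to the key server.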

In some situations, authentication server 24 is used in authenticating key requesters. In other situations, key server 20 may perform authentication. Key requests may be made by encryption engine 26 when a copy of a key K is needed to perform an encryption operation or by decryption engine 28 when a copy of key K is needed to perform a decryption operation. In general, any suitable technique may be used to process key requests. Flow charts presenting three illustrative ways in which key requests for key K may be handled in system 10 are shown in FIGS. 8, 9, and 10.

In the example of FIG. 8, an encryption engine or decryption engine associated with an application 16 makes a key request to key server 20 at step 106. Key requests such as the key request of step 106 may be made in a single transmission over network 14 between the computing equipment 12 on which the requesting application resides and key server 20 or may be made in multiple associated transmissions. The key request may include authentication credentials and an identifier such as the identifier parameter IDEN described in connection with FIGS. 7A and 7B. The identifier that is associated with the key request may include information such as a validity period (e.g., a credit card expiration date), user name, etc. Different types of keys may require different levels of authentication. The authentication credentials that are provided as part of the key request are preferably provided in a form that is suitable for the type of key being requested. One example of authentication credentials is a userID and password. Biometric authentication credentials may also be used (as an example).

At step 108, key server 20 forwards the authentication credentials that have been received from the key requester to authentication server 24 over communications network 14.

At step 110, authentication server 24 verifies the authentication credentials. For example, if the authentication credentials include a userID and password, authentication server 24 may compare the userID and password to a list of stored valid userIDs and passwords.

If the authentication server 24 determines that the authentication credentials are not valid, the authentication process fails. A suitable response to this failure may be generated at step 112. For example, authentication server 24 can notify key server 20 that the authentication credentials are not valid and can generate suitable alert messages for entities in system 10. Other suitable actions include generating an error message that prompts key server 20 and/or the key requester to resubmit the credentials (e.g., to avoid the possibility that the authentication failure was due to mistyped authentication credentials).

If the authentication server 24 determines that the authentication credentials are valid, the authentication server 24 notifies the key server 20 accordingly. In a typical scenario, the authentication server provides the key server 20 with an “assertion” indicating that the credentials are valid. The assertion may include information on group membership and roles and rights for the authenticated party.

At step 114, key server 20 applies policy rules 22 to the key request. Information such as the identity information IDEN, the authentication results from authentication server 24 (e.g., the assertion), and external information such as the current date may be used by the key server 20 in enforcing the policy rules.

As an example, identity information, authentication results, and external information may be used in determining which policy rules should be applied. Certain policy rules may be applied when IDEN indicates that the key requester is making a “PCI” key request. Such rules may, as an example, require a particular level of authentication. Certain policy rules may also be applied when a key request is made at particular times and dates (e.g., more stringent authentication may be required for evening and weekend key requests). Certain policy rules may apply to particular groups of users, etc.

In addition to determining which policy rules should be applied, key server 20 may also use identity information, authentication results, and external information in determining whether the applicable policy rules have been satisfied. For example, during step 114, key server 20 may determine whether the key request includes valid validity period information (e.g., whether a validity period has expired). Key server 20 may also check to make sure that appropriate valid authentication results have been received from authentication server 24, may check the key requester's membership in a directory group, etc.

If the criteria set forth in the applicable policy rules are not satisfied, the key request fails and appropriate error notifications may be generated or other actions may be taken at step 116.

If the applicable policy rules are satisfied, key server 20 may generate a key K to satisfy the key request at step 118. The key K may be generated using operations of the type shown in FIG. 7A or may be generated or retrieved using operations of the type shown in FIG. 7B. The key K may then be supplied to the key requester over a secure path in communications network 14.

In this example, key server 20 applies the applicable policy rules to the key request following successful verification of the authentication credentials by authentication server 24. If desired, the policy rules can be applied between steps 106 and 108. In this type of scenario, the key server need not submit the authentication credentials to the authentication server if the policy rules are not satisfied (e.g., if validity period information indicates that an expiration date has passed).

Another illustrative technique that may be used by an encryption engine or decryption engine associated with an application to obtain key K is shown in FIG. 9. With this technique, authentication is performed using authentication server 24 before the key request is made to key server 20.

At step 120, an application 16 that desires a key K provides authentication credentials to authentication server 24 for verification. If desired, the application may also provide an identifier (e.g., parameter IDEN) to authentication server 24, which may use this information to determine what type of assertion to provide to the application following successful verification of the authentication credentials.

At step 122, authentication server 24 verifies the authentication credentials. If the authentication credentials are not valid, an appropriate response may be made at step 124 (e.g., by providing the application with another chance to provide valid credentials, by issuing an alert, etc.).

If the authentication credentials are determined to be valid, the authentication server provides the application with an assertion over communications network 14. The assertion may be, for example, a Kerberos ticket.

At step 126, the application uses the assertion that has been received from the authentication server in making a key request to key server 20. The key request may include the assertion from authentication server 24 and an identifier (e.g., parameter IDEN).

At step 128, the key server applies policy rules 22 to the key request to determine whether the key request should be satisfied. Key server 20 may use identity information (e.g., parameter IDEN, which may include a validity period), authentication results (e.g., the assertion), and external information (e.g., the current date) in determining which policy rules should be applied to the key request. The key server may also use this information in determining whether the applicable policy rules have been satisfied. As an example, key server 20 may determine whether the key request includes valid validity period information during step 128 and may check to determine whether the assertion is valid and sufficient to satisfy the policy rules.

If the applicable policy rules are not satisfied, the key server 20 may request that the application issue a new request or may take other suitable actions in response to the failure (step 130).

If the key server determines that the applicable key access policy rules have been satisfied, the key server may retrieve key K from cache or may generate an appropriate key K, as discussed in connection with FIGS. 7A and 7B. At step 132, the key K may be provided from key server 20 to the requesting application over communications network 14.

With the approach of FIG. 10, authentication operations are performed by key server 20, so authentication server 24 need not be used.

At step 134, an application that needs key K makes a key request to key server 20. The key request may include an identifier (e.g., parameter IDEN) and shared secret information. The shared secret information may be, for example, a shared secret (i.e., a secret known by the application and by the key server) or shared secret information that is derived from the shared secret (e.g., by hashing the shared secret with an identifier such as parameter IDEN).

At step 136, the key server verifies the shared secret information. The key server may, as an example, compare the shared secret information from the key request to previously generated and stored shared secret information or to shared secret information that is generated in real time based on the received identity (e.g., IDEN). If the shared secret information is valid, the key server can determine which key access policy rules are to be applied to the key request (e.g., using external information such as the current date, using identity information IDEN, etc.). After determining which policy rules to use, key server 20 applies the appropriate policy rules to the key request.
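The derivation and verification of shared secret information can be sketched as follows. Hashing the shared secret concatenated with IDEN using SHA1 is one possible reading of "hashing the shared secret with an identifier" and is an assumption here, as are the function names. The comparison uses a constant-time routine from the Python standard library.

```python
import hashlib
import hmac


def shared_secret_info(shared_secret: bytes, iden: str) -> bytes:
    """Derive shared secret information from the shared secret and IDEN (assumed form)."""
    return hashlib.sha1(shared_secret + iden.encode("utf-8")).digest()


def verify_request(received_info: bytes, shared_secret: bytes, iden: str) -> bool:
    """Key server side: regenerate the value in real time and compare."""
    expected = shared_secret_info(shared_secret, iden)
    return hmac.compare_digest(received_info, expected)


secret = b"secret known to application and key server"   # stand-in value
iden = "JOE_SMITH_PCI_05/08"
request_info = shared_secret_info(secret, iden)          # computed by the application
ok = verify_request(request_info, secret, iden)          # checked at the key server
```

Sending a derived value rather than the shared secret itself means that the secret does not appear in the key request.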

If the criteria set forth in the policy rules are not satisfied, the key request fails and appropriate actions can be taken at step 138.

If the policy rules are satisfied, the key server can retrieve key K from cache or may generate key K in real time (e.g., using the operations of FIG. 7A or FIG. 7B). The requested key may then be provided to the key requester over communications network 14 (step 140).

One of the potential advantages of using key server 20 is that it helps to avoid problems that might otherwise arise when storing keys in local cache on computing equipment 12. If keys are only maintained in local storage, it may be difficult to recreate a key when needed to resurrect a server that has crashed. By using key server 20, keys can be regenerated as needed at the key server.

Systems such as system 10 of FIG. 1 may use validity periods to control when keys are valid. A first application may encrypt plaintext using a cryptographic key that is based on a given validity period. The resulting ciphertext may then be stored in a database and retrieved by a second application or may be provided directly to the second application over network 14. The second application must obtain a copy of key K to decrypt the ciphertext. The key K must be generated using the given validity period. If an incorrect validity period is used in generating K, the value of K will be incorrect and the second application will not be able to use that value of K to decrypt the ciphertext.

To ensure that applications are properly informed of which validity period to use when processing a given data item, the validity period can be embedded in the data item. When an application needs to determine what validity period applies to a particular data item, the validity period can be extracted from the data item by the application.

Consider, as an example, credit card numbers. The last digit of a credit card number is a checksum digit. In a normal valid credit card number, the value of the checksum digit represents a valid checksum that is computed based on the preceding digits of the credit card number (i.e., the sixteenth digit in a credit card number is a checksum digit computed from the first fifteen digits of the credit card number). The checksum digit can be used to determine whether a given credit card number is valid.
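The text does not name the checksum algorithm. Payment card numbers conventionally use the Luhn (mod 10) algorithm, so the following Python sketch assumes it. The function names are hypothetical and are introduced only for illustration.

```python
def luhn_check_digit(payload: str) -> int:
    """Return the Luhn check digit for a digit string (e.g., the first 15 digits)."""
    total = 0
    # Walk right to left; the digit adjacent to the check digit is doubled first.
    for i, ch in enumerate(reversed(payload)):
        d = int(ch)
        if i % 2 == 0:
            d *= 2
            if d > 9:
                d -= 9  # equivalent to summing the two digits of the product
        total += d
    return (10 - total % 10) % 10


def has_valid_checksum(number: str) -> bool:
    """Check whether the last digit is the valid checksum of the preceding digits."""
    if not number.isdigit() or len(number) < 2:
        return False
    return luhn_check_digit(number[:-1]) == int(number[-1])
```

With this sketch, the widely used test number 4111111111111111 passes the check, while changing its last digit causes the check to fail.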

Validity period information can be embedded in the credit card number by adding a validity period index to the checksum. With one suitable arrangement, the validity period index matches index values 1 through 9 with years 2006, 2007, . . . 2014, respectively. The validity period index value for 2006 is 1, the validity period index value for 2007 is 2, etc. By combining an appropriate validity period index with a checksum number, the validity period can be embedded into the checksum digit and therefore into the credit card number.

Validity period embedding is illustrated in FIG. 11. In the example of FIG. 11, the valid checksum digit for an unencrypted credit card (plaintext) is 0. Following application of a format-preserving encryption function, the first 15 digits of the credit card number are transformed into encrypted digits. A new valid checksum can be computed based on these encrypted digits. In the example of FIG. 11, the recomputed valid checksum is 3. The validity period that is to be embedded into the checksum digit is 2006 (in this example). The index value for validity period 2006 is 1.

As shown in FIG. 11, the validity period can be embedded into the checksum digit by adding 1 (the index value for 2006) to 3 (the checksum). The resulting modified checksum digit will be 4. When this modified checksum digit is used in the ciphertext version of the credit card number, it will not represent a valid checksum for the ciphertext version of the credit card number. However, applications will be able to extract the validity period from the ciphertext, obviating the need to keep track of the validity period separately.
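The arithmetic of the FIG. 11 example can be sketched in Python as follows. The text describes the combination as an addition; reducing the sum modulo 10 so that the result remains a single digit is an assumption of this sketch, as are the function and table names.

```python
# Index table from the text: 2006 -> 1, 2007 -> 2, ..., 2014 -> 9.
INDEX_FOR_YEAR = {year: year - 2005 for year in range(2006, 2015)}
YEAR_FOR_INDEX = {index: year for year, index in INDEX_FOR_YEAR.items()}


def embed_validity(valid_checksum: int, year: int) -> int:
    """Combine the valid checksum with the validity period index.

    The reduction modulo 10 is an assumption that keeps the result one digit wide.
    """
    return (valid_checksum + INDEX_FOR_YEAR[year]) % 10


def extract_validity(checksum_digit: int, valid_checksum: int) -> int:
    """Reverse the embedding: subtract the recomputed valid checksum."""
    return YEAR_FOR_INDEX[(checksum_digit - valid_checksum) % 10]
```

For the values of FIG. 11, `embed_validity(3, 2006)` gives the modified checksum digit 4, and `extract_validity(4, 3)` recovers 2006. An index of 0 is deliberately absent from the table, so a ciphertext whose checksum digit is already valid is not mistaken for one that carries a validity period.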

FIG. 12 is a diagram showing how a credit card numbering scheme with embedded validity period information may be implemented in a system such as system 10 of FIG. 1. As shown by line 142, a first application 16-1 may receive plaintext such as a credit card number. The plaintext may be manually input into application 16-1 by an operator, may be received from another application, etc.

Application 16-1 encrypts the plaintext to form ciphertext. As indicated by line 144, application 16-1 may request a copy of a key K from key server 20. If application 16-1 is authorized, key server 20 will provide the requested key K to application 16-1 (line 146). Application 16-1 encrypts the plaintext using encryption engine 26 and key K (line 148) to produce ciphertext. As part of the encryption operation, application 16-1 can embed validity period information into the ciphertext.

Application 16-1 can store the ciphertext in database 18 for subsequent retrieval by application 16-2 (lines 150 and 152). Alternatively, application 16-1 can provide the ciphertext to application 16-2 directly (line 154). Application 16-2 extracts the validity period information from the ciphertext and uses this validity period information in requesting an appropriate key K for decrypting the ciphertext (line 156). If authorized, key server 20 provides the requested key to application 16-2 (line 158). The key is used in decryption engine 28 by application 16-2 to decrypt the ciphertext, producing plaintext (line 160). The plaintext may be used by application 16-2 or other applications in system 10 to which application 16-2 provides the plaintext.

Illustrative steps involved in encrypting and decrypting credit card numbers in an arrangement of the type shown in FIG. 12 in which validity period information is embedded in credit card checksum digits are shown in FIG. 13.

At step 162 of FIG. 13, application 16-1 obtains an unencrypted (plaintext) credit card number. The credit card number has a valid checksum digit (e.g., the 16th digit out of 16 digits in the credit card number).

An encryption engine 26 associated with application 16-1 requires a key K to encrypt the credit card number. The application 16-1 therefore obtains a key K (step 164). Suitable techniques for obtaining key K are described in connection with FIGS. 8, 9, and 10. With one suitable arrangement, application 16-1 provides key server 20 with a key request that contains an identity IDEN containing a validity period. Key server 20 may generate key K using equation 5.

At step 166, application 16-1 (e.g., encryption engine 26 at application 16-1) removes the original checksum from the plaintext (e.g., to produce a 15-digit string) and encrypts the string from which the checksum digit has been removed using a format-preserving encryption function of the type described in connection with FIG. 3. The format-preserving encryption operation uses the key K and the 15-digit string (number) as inputs and produces an encrypted 15-digit string (number) as an output (in this example).

At step 168, the application 16-1 (e.g., encryption engine 26) computes a new valid checksum from the 15-digit encrypted string. The application 16-1 then embeds validity period information in the checksum digit. Any suitable technique may be used to mathematically combine the validity period information and the valid checksum. With one suitable approach, a validity period index is created (e.g., an index value of 1 corresponding to a validity period of 2006, etc.) and this validity period index value is added to the checksum to produce a checksum digit into which the validity period information has been embedded. If desired, other suitable mathematical functions may be used to embed the validity period information into the checksum (or into other redundant information in a string). The ciphertext that results from the processing of step 168 includes a leading 15 digits of encrypted credit card data followed by a single checksum digit into which the validity period information has been embedded.

At step 170, the ciphertext is provided from application 16-1 to application 16-2 through a database 18 or direct transfer.

At step 172, application 16-2 receives the ciphertext version of the credit card number. Application 16-2 separates the portion of the ciphertext that does not include the embedded validity period from the ciphertext (e.g., application 16-2 separates the leading 15 digits of the ciphertext from the checksum digit). Application 16-2 computes a valid checksum for the leading 15 digits. Application 16-2 uses the newly computed valid checksum to extract the validity period from the checksum digit. The mathematical function that is used to extract the validity period reverses the embedding process used at step 168. For example, if the validity period was embedded into the checksum digit by adding the validity period index to the checksum at step 168, application 16-2 subtracts the newly computed valid checksum from the checksum digit at step 172 to reveal the embedded validity period index value.

At step 174, application 16-2 uses the validity period that has been extracted from the checksum digit in obtaining a copy of key K. In particular, application 16-2 may formulate a key request for key server 20 that includes the extracted validity period. If application 16-2 is authorized, key server 20 may use the validity period in generating the key K (see, e.g., equation 5) and may provide the requested key K to application 16-2.

At step 176, application 16-2 may use decryption engine 28 to decrypt the ciphertext. Decryption engine 28 uses the key K and the ciphertext as inputs and produces the original plaintext version of the credit card number as an output. During decryption, decryption engine 28 may apply a format-preserving decryption algorithm of the type described in connection with FIG. 3 to decrypt the first 15 digits of the ciphertext version of the credit card number to produce 15 corresponding plaintext credit card digits. The decryption engine 28 may also compute a valid checksum for the 15 decrypted digits and may append the valid checksum to the 15 decrypted digits to produce a complete 16-digit plaintext credit card number having a valid checksum.
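Steps 162 through 176 can be traced end to end with the following Python sketch. Several parts are stand-ins and are assumptions of the example: `toy_fpe` is a deliberately simple, insecure digit-shifting cipher that merely preserves the digit format (it is not the algorithm of FIG. 3), `key_for_year` imitates a key request by applying HMAC-SHA256 to a hypothetical root secret (it is not equation 5), the checksum is assumed to be a Luhn check digit, and the embedded sum is reduced modulo 10.

```python
import hashlib
import hmac

ROOT_SECRET = b"example-root-secret"  # hypothetical stand-in for the key server's root secret


def luhn(payload: str) -> int:
    """Luhn check digit for a digit string (assumed checksum algorithm)."""
    total = 0
    for i, ch in enumerate(reversed(payload)):
        d = int(ch) * 2 if i % 2 == 0 else int(ch)
        total += d - 9 if d > 9 else d
    return (10 - total % 10) % 10


def key_for_year(year: int) -> bytes:
    """Stand-in for a key request: one-way function of root secret and validity period."""
    return hmac.new(ROOT_SECRET, f"validity={year}".encode(), hashlib.sha256).digest()


def toy_fpe(key: bytes, digits: str, decrypt: bool = False) -> str:
    """Toy cipher (NOT secure): shift each digit by a keystream byte, modulo 10."""
    stream = hmac.new(key, b"toy-fpe", hashlib.sha256).digest()
    sign = -1 if decrypt else 1
    return "".join(str((int(d) + sign * stream[i]) % 10) for i, d in enumerate(digits))


def encrypt_card(card: str, year: int) -> str:
    if not 2006 <= year <= 2014:
        raise ValueError("validity period outside the index table")
    body = card[:-1]                              # step 166: remove the checksum digit
    encrypted = toy_fpe(key_for_year(year), body)  # step 166: encrypt the 15 digits
    digit = (luhn(encrypted) + (year - 2005)) % 10  # step 168: embed the index
    return encrypted + str(digit)


def decrypt_card(ciphertext: str) -> str:
    encrypted, digit = ciphertext[:-1], int(ciphertext[-1])
    index = (digit - luhn(encrypted)) % 10        # step 172: extract the index
    if index == 0:
        raise ValueError("no validity period embedded")
    key = key_for_year(2005 + index)              # step 174: obtain K for that period
    body = toy_fpe(key, encrypted, decrypt=True)   # step 176: decrypt the 15 digits
    return body + str(luhn(body))                 # step 176: restore a valid checksum
```

In this sketch the decrypting side needs only the ciphertext: the validity period is recovered from the checksum digit before any key is requested.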

The format-preserving cryptographic functions of encryption engine 26 and decryption engine 28 may be used to facilitate software testing.

As shown in FIG. 14, a typical production environment 178 has multiple applications 16 that access a common database 18. During normal use of applications 16 in a production system, applications 16 access data 182. Data 182 may be provided in the form of one or more tables, some of which may contain sensitive data items (e.g., credit card numbers, etc.).

Before applications such as applications 16 are released into general use in production environment 178, testing is performed in a test environment 180. As shown in FIG. 14, the system of test environment 180 has applications 16 and a database 18 that are similar to those in production environment 178. Data 184 (e.g., tables of data of the type stored in database 18 in production environment 178) is stored in database 18 of test environment 180. However, the data 184 in test environment 180 is generally less secure than the data 182 in production environment 178. This is because test environments typically lack the sophisticated security measures (strong firewalls, up-to-date antivirus software, etc.) that are found in production environments.

Although test environments such as test environment 180 are often less secure than normal production environments such as environment 178, it is generally desired to test the applications 16 in test environment 180 using realistic data (e.g., credit card numbers and other data items that have appropriate string lengths and character values characteristic of valid real data). With conventional testing arrangements, a test environment database is created by copying a production environment database. This may expose sensitive information such as credit card numbers to attacks in the test environment.

To ensure that sensitive data is not exposed to attacks, format-preserving encryption engine 26 is used to encrypt the data in table 182. The encrypted data 184 may be exported to a database in test environment 180. Any suitable amount of data may be encrypted and exported in this way. For example, the entire contents of database 18 in production environment 178 may be encrypted prior to exporting the encrypted data to database 18 in test environment 180. If desired, only sensitive data may be encrypted (e.g., credit card numbers and social security numbers), while less sensitive data is not encrypted (or is encrypted or obscured using less sophisticated techniques).

The encrypted data may be encrypted on a field-by-field basis. For example, credit card fields in the production database may be individually encrypted using format-preserving encryption engine 26 to produce encrypted credit card numbers in the test database. Because the encrypted credit card numbers have the same format as the unencrypted credit card numbers, accurate testing of applications 16 and database 18 can be performed in test environment 180. Because sensitive information is encrypted before it is exported to the test environment, the risk of exposing sensitive data to attackers in the test environment is significantly reduced.
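Field-by-field export to a test database can be sketched in Python as follows. The column names, the row layout, and the `toy_fpe_encrypt` helper are hypothetical. The helper is an insecure placeholder that only demonstrates format preservation (digits map to digits, separators are kept); a real deployment would substitute the format-preserving encryption function of encryption engine 26.

```python
import hashlib
import hmac

SENSITIVE_FIELDS = {"card_number", "ssn"}  # hypothetical column names


def toy_fpe_encrypt(key: bytes, text: str) -> str:
    """Placeholder cipher (NOT secure): shift digits modulo 10, keep other characters."""
    stream = hmac.new(key, b"toy-fpe", hashlib.sha256).digest()
    out, i = [], 0
    for ch in text:
        if ch in "0123456789":
            out.append(str((int(ch) + stream[i % len(stream)]) % 10))
            i += 1
        else:
            out.append(ch)  # separators such as "-" keep their positions
    return "".join(out)


def export_for_test(rows, key: bytes):
    """Encrypt sensitive fields individually; export less sensitive fields unchanged."""
    exported = []
    for row in rows:
        exported.append({
            name: toy_fpe_encrypt(key, value) if name in SENSITIVE_FIELDS else value
            for name, value in row.items()
        })
    return exported
```

Because each encrypted field keeps the length and character classes of the original, applications under test can parse and store the exported rows exactly as they would production rows.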

It may be desirable to selectively grant applications access to different parts of a data string. Consider the example of a credit card number. As shown in FIG. 15, a sixteen-digit credit card number may include three parts. The leading six digits of the credit card number (plaintext part P1) are sometimes referred to as the bank identification number (BIN). The next six digits of the credit card number (plaintext part P2) are sometimes referred to as the account number core and form part of the credit card holder's account number. The last four digits of the credit card number (plaintext part P3) are sometimes referred to as user account number information and are used with the account number core to identify a credit card holder's account.

Different parties may be entitled to access different parts of the credit card number. Some parties may only need access to the BIN. Other parties may require access to the entire credit card number. As a result, it may be desirable to selectively grant access to different portions of the credit card number to different parties.

As shown in FIG. 16, selective access may be accomplished by using encryption engine 26 to encrypt part P1 with key K1, producing encrypted part P1 (ciphertext C1). Encryption engine 26 may encrypt part P2 with key K2 to produce encrypted part P2 (ciphertext C2). Part P3 may be encrypted with the encryption engine using key K3, producing encrypted part P3 (ciphertext C3). Encryption engine 26 may then be used to encrypt C1, C2, and C3 together using key K4 to produce credit card ciphertext (i.e., ciphertext for all of parts P1, P2, and P3 together). Keys K1, K2, K3, and K4 may be four independent cryptographic keys. Encryption with key K4 helps to prevent matching attacks (e.g., attacks in which an attacker attempts to gather information on the unencrypted credit card numbers by noting when values of C2 are identical for two different credit card numbers). Selective decryption may be performed by decrypting the ciphertext with key K4 and then decrypting a selected one of C1, C2, and C3 using an appropriate one of K1, K2, and K3.
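The four-key arrangement of FIG. 16 can be sketched in Python as follows. The `fpe_encrypt` and `fpe_decrypt` functions are a small Feistel construction over decimal strings that is included only so the example is self-contained. It is an assumption of this sketch, not the algorithm of FIG. 3, and it has not been analyzed for security, particularly on short inputs such as the four-digit part P3. The function names and the 6/6/4 slicing helper are likewise hypothetical.

```python
import hashlib
import hmac


def _round_value(key: bytes, rnd: int, half: str, width: int) -> int:
    """Keyed round function: HMAC of the other half, reduced to `width` digits."""
    mac = hmac.new(key, bytes([rnd]) + half.encode(), hashlib.sha256).digest()
    return int.from_bytes(mac, "big") % 10 ** width


def fpe_encrypt(key: bytes, digits: str, rounds: int = 8) -> str:
    """Toy Feistel cipher on a digit string of length >= 2 (even round count)."""
    a, b = digits[: len(digits) // 2], digits[len(digits) // 2 :]
    for rnd in range(rounds):
        w = len(a)
        c = (int(a) + _round_value(key, rnd, b, w)) % 10 ** w
        a, b = b, str(c).zfill(w)
    return a + b


def fpe_decrypt(key: bytes, digits: str, rounds: int = 8) -> str:
    """Invert fpe_encrypt by undoing the rounds in reverse order."""
    a, b = digits[: len(digits) // 2], digits[len(digits) // 2 :]
    for rnd in reversed(range(rounds)):
        w = len(b)
        prev = (int(b) - _round_value(key, rnd, a, w)) % 10 ** w
        a, b = str(prev).zfill(w), a
    return a + b


BOUNDS = {1: (0, 6), 2: (6, 12), 3: (12, 16)}  # P1 = BIN, P2 = core, P3 = last four


def encrypt_card(card: str, k1: bytes, k2: bytes, k3: bytes, k4: bytes) -> str:
    """Encrypt P1, P2, and P3 separately, then encrypt C1, C2, C3 together with K4."""
    keys = {1: k1, 2: k2, 3: k3}
    inner = "".join(fpe_encrypt(keys[p], card[lo:hi]) for p, (lo, hi) in BOUNDS.items())
    return fpe_encrypt(k4, inner)


def decrypt_part(ciphertext: str, k4: bytes, part: int, part_key: bytes) -> str:
    """Two-step selective decryption: remove the K4 layer, then decrypt one part."""
    lo, hi = BOUNDS[part]
    return fpe_decrypt(part_key, fpe_decrypt(k4, ciphertext)[lo:hi])
```

A party holding K4 and K1 can recover the BIN with `decrypt_part(ciphertext, k4, 1, k1)` but sees only C2 and C3 for the remaining digits.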

In some situations, it may be desirable to grant different parties or applications access to portions of a data string (e.g., P1, P2, and P3 in the present example) according to their sensitivity. In a credit card number, part P1 is considered less sensitive than part P2 and part P2 is considered less sensitive than part P3. Parts P1, P2, and P3 can therefore be ranked according to their sensitivity, with P1 being the least sensitive and with P3 being the most sensitive.

With one suitable encryption scheme, which is shown in FIG. 17, plaintext part P1 is encrypted first. With this scheme, encryption engine 26 uses a key K1 to produce encrypted P1 (ciphertext C1). Engine 26 uses a one-way function H such as a hash function (e.g., SHA1) to compute H(C1), which is combined with a key BASE K2 to produce a key K2. The function + in FIG. 17 represents any suitable combining function such as concatenation or addition. The key K2 is used to encrypt plaintext part P2, producing encrypted part P2 (ciphertext C2). This randomizes P2 relative to P1. Once C2 has been computed, engine 26 may compute H(C2) and may combine H(C2) with key BASE K3 to produce key K3. Engine 26 may then use key K3 to encrypt plaintext part P3. This produces ciphertext C3 and randomizes P3 relative to parts P1 and P2.

When it is desired to provide access to part P1 without providing access to part P2, an application may be provided with key K1, but not keys K2 and K3. Selective access to part P2 can be granted by providing an application with key K2. Key K3 can be provided to an application that desires selective access to part P3. Encryption can be provided using only three encryption operations, rather than the four encryption operations used in the approach of FIG. 16. Moreover, decryption of any given part can be performed in a single step, rather than using two steps. To produce P1, it is only necessary to decrypt C1 with K1. Similarly, decryption engine 28 can produce P2 by decrypting C2 with K2 and can produce P3 by decrypting C3 with K3.
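The chained key derivation of FIG. 17 can be sketched in Python as follows. The sketch uses concatenation as the combining function + and substitutes SHA-256 for the SHA1 example given for H. The `toy_fpe` cipher is an insecure digit-shifting placeholder that stands in for encryption engine 26, and all function names are hypothetical. Which party recomputes K2 and K3 from the ciphertext (the key server or an application holding a base key) is not specified by the text, so `chain_key` is shown as a free-standing helper.

```python
import hashlib
import hmac


def toy_fpe(key: bytes, digits: str, decrypt: bool = False) -> str:
    """Placeholder cipher (NOT secure): shift each digit by a keystream byte, modulo 10."""
    stream = hmac.new(key, b"toy-fpe", hashlib.sha256).digest()
    sign = -1 if decrypt else 1
    return "".join(str((int(d) + sign * stream[i]) % 10) for i, d in enumerate(digits))


def chain_key(base_key: bytes, prev_ciphertext: str) -> bytes:
    """K = BASE K + H(C), with + as concatenation and H as SHA-256 (both assumptions)."""
    return base_key + hashlib.sha256(prev_ciphertext.encode()).digest()


def encrypt_card(card: str, k1: bytes, base_k2: bytes, base_k3: bytes) -> str:
    """Three encryption operations; each later key depends on the preceding ciphertext."""
    c1 = toy_fpe(k1, card[:6])                        # P1 -> C1 with K1
    c2 = toy_fpe(chain_key(base_k2, c1), card[6:12])  # K2 = BASE K2 + H(C1)
    c3 = toy_fpe(chain_key(base_k3, c2), card[12:])   # K3 = BASE K3 + H(C2)
    return c1 + c2 + c3


def decrypt_p1(ciphertext: str, k1: bytes) -> str:
    return toy_fpe(k1, ciphertext[:6], decrypt=True)


def decrypt_p2(ciphertext: str, k2: bytes) -> str:
    return toy_fpe(k2, ciphertext[6:12], decrypt=True)


def decrypt_p3(ciphertext: str, k3: bytes) -> str:
    return toy_fpe(k3, ciphertext[12:], decrypt=True)
```

Each part is recovered with a single decryption operation. Because K2 is derived from C1, two card numbers that share part P2 but differ in part P1 are encrypted under different keys, which is the sense in which P2 is randomized relative to P1.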

With the technique of FIG. 17, matching attacks on part P1 are tolerated. This is acceptable because the BIN portion of a credit card number is not generally considered highly sensitive. Parts P2 and P3, which are more sensitive than part P1, are secure against matching attacks.

The foregoing is merely illustrative of the principles of this invention and various modifications can be made by those skilled in the art without departing from the scope and spirit of the invention.

What is claimed is:
 1. A method for testing applications that access a test database in a test environment before using the applications to access a production database in a production environment, comprising: encrypting sensitive data in the production database using a format-preserving encryption algorithm; and exporting the encrypted data from the production database to the test database; and testing the applications by using the applications in the test environment to access the encrypted data in the test database.
 2. The method defined in claim 1, wherein encrypting the sensitive data in the production database using the format-preserving encryption algorithm comprises encrypting credit card numbers in the production database using the format-preserving encryption algorithm.
 3. The method defined in claim 2, wherein encrypting the credit card numbers in the production database using the format-preserving encryption algorithm comprises: obtaining an unencrypted credit card number at the production database; and removing a checksum digit from the unencrypted credit card number.
 4. The method defined in claim 3, wherein encrypting the credit card numbers in the production database using the format-preserving encryption algorithm further comprises: obtaining a cryptographic key; and with an encryption engine implemented on computing equipment, encrypting the unencrypted credit card number from which the checksum digit was removed using the cryptographic key to produce an encrypted version of the unencrypted credit card number from which the checksum digit was removed.
 5. The method defined in claim 4, wherein encrypting the credit card numbers in the production database using the format-preserving encryption algorithm further comprises: computing a new valid checksum for the encrypted version.
 6. The method defined in claim 5, wherein encrypting the credit card numbers in the production database using the format-preserving encryption algorithm further comprises: embedding a key selector into a checksum digit by combining the new valid checksum and the key selector.
 7. The method defined in claim 6, wherein encrypting the credit card numbers in the production database using the format-preserving encryption algorithm further comprises: adding the checksum digit to the encrypted version to produce ciphertext corresponding to the unencrypted credit card number.
 8. The method defined in claim 1, wherein the production database includes less sensitive data that is less sensitive than the sensitive data, the method further comprising: exporting the less sensitive data from the production database to the test database without encrypting the less sensitive data.
 9. The method defined in claim 8, wherein encrypting the sensitive data in the production database using the format-preserving encryption algorithm comprises encrypting credit card numbers in the production database using the format-preserving encryption algorithm.
 10. The method defined in claim 9, wherein testing the applications by using the applications in the test environment to access the encrypted data in the test database comprises accessing the encrypted data in the test database without decrypting the encrypted data.
 11. The method defined in claim 1, wherein testing the applications by using the applications in the test environment to access the encrypted data in the test database comprises accessing the encrypted data in the test database without decrypting the encrypted data.